About this primer
Ever since the first genome-wide association study (GWAS) on age-related macular degeneration, and the promise of personalized medicine in the wake of the Human Genome Project, large-scale genetic association studies have held significant sway in contemporary health research and have driven drug-development pipelines. Over the past two decades, researchers have used GWAS to uncover genetic variants linked to human traits, such as the color of your eyes, and to rare and common complex diseases. These findings help to unravel the mechanisms underlying diseases and shed light on whether the correlations between risk factors and diseases identified in observational studies are truly causal.
These studies have ushered in an exciting era in which many researchers thrive on developing new methods and bioinformatic tools to parse the ever-growing datasets collected in large population-based biobanks. However, analyzing these data is challenging, and it can be daunting to see the forest for the trees among the many tools and their various functions. Enter A Practical Primer in Human Complex Genetics. This GitBook was originally written in 2022 for the Genetic Epidemiology course organized by the Master Epidemiology of Utrecht University. This practical guide will teach you how to design a GWAS, perform quality control (QC), execute the actual analyses, annotate the GWAS results, and perform further downstream post-GWAS analyses. Throughout the book you’ll work with ‘dummy’, that is fake, data, but in the end we will use real-world data from the first release of the Wellcome Trust Case-Control Consortium (WTCCC), focusing on coronary artery disease (CAD).
A major component of modern-day GWAS is genetic imputation, but for practical reasons it is not part of this book. However, I will provide some pointers on how to go about this with minimal coding or scripting experience. Likewise, the course does not cover meta-analyses of GWAS, but some excellent resources exist to which I will direct you. As this practical primer evolves, these and other topics may find their place in this book.
I should also point out that the emphasis of this book is on being a practical primer. It is intended to provide practical guidance on doing GWAS, and while theory is important, I will not cover it. Again, some very useful and excellent work exists to which I will point you, but I really want you to learn, and understand the theory, by doing.
So, although originally crafted as a companion for the course, this practical guide stands on its own as a comprehensive resource for diving into all facets of doing a GWAS — save for experimental follow-up, of course 😉.
I can imagine this seems overwhelming, but trust me, you’ll be okay. Just follow this practical. You’ll learn by doing and at the end of the day, you can execute a GWAS independently.
Ready to start?
Some background reading
Standing on the shoulders of giants, that’s what this book and I do. I want to acknowledge some great work that has helped me tremendously and, really, this book wouldn’t exist without this awesome work. So, I do want to give you some background reading. Is it a prerequisite? No, not really. For starters, the course covers most of it and you’ll learn as you go. And if you didn’t come here through the course, you’ll be fine just the same. That said, it’s always a good idea to get familiar with these works as you move forward on your path towards your first GWAS. In fact, I had these printed out with markings and writings all over them as I executed my first GWAS, and they’ve been great as a reference many times after.
Large parts of this work are based on four awesome Nature Protocols from the Zondervan group at the Wellcome Centre for Human Genetics.
- Zondervan KT et al. Designing candidate gene and genome-wide case-control association studies. Nat Protoc 2007.
- Pettersson FH et al. Marker selection for genetic case-control association studies. Nat Protoc 2009.
- Anderson CA et al. Data QC in genetic case-control association studies. Nat Protoc 2010.
- Clarke GM et al. Basic statistical analysis in genetic case-control studies. Nat Protoc 2011.
An update on the community standards of QC for GWAS can be found here:
- Laurie CC et al. Quality control and quality assurance in genotypic data for genome-wide association studies. Genet Epidemiol 2010.
With respect to imputation and meta-analyses of GWAS you should also get familiar with the following works:
- Marchini J and Howie B. Genotype imputation for genome-wide association studies. Nat Rev Genet 2010.
- de Bakker PIW et al. Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum Mol Genet 2008.
- Winkler TW et al. Quality control and conduct of genome-wide association meta-analyses. Nat Protoc 2014.
Are you ready?
Are you ready? Did you bring coffee and a good dose of energy? Let’s start! Your first point of action is to prepare your system for this course in Chapter 3.
Getting started
[THIS CHAPTER NEEDS WORK]
- introduction
- briefly touch on operating system
- split CoCalc vs standalone
- CoCalc: everything is installed
- standalone:
- focus on macOS
- what to install
- show how to navigate on macOS Terminal
[Some introductory text]
Your computer
Before getting started, we need to discuss your computer. Most programs made to execute genetic epidemiology studies are developed for the Unix environment, for example Linux and macOS. So, they may not work as intended in a Windows environment. Windows does allow users to install a Linux subsystem within Windows 10+ and you can find the detailed guide here.
However, I highly recommend one of two options.
- One, install a Linux subsystem on your Windows computer (for example a virtual machine with Ubuntu could work).
- Two, switch to macOS in combination with homebrew. This will give you all the flexibility to use Unix-based programs for your genetic epidemiology work and at the same time you’ll keep the advantage of a powerful computer with a user-friendly interface.
I chose the latter.
For this practical every command is intended for Linux/macOS, in other words Unix-systems.
CoCalc vs. Standalone
For the purpose of this practical primer there are two routes to get started, and you need to take one of them. If you are following the course, read the section CoCalc. If you are using this book as a standalone, check out the instructions in the section Standalone; this is probably also the section you want to follow for real-world cases.
But first, I’ll briefly provide some background on the various programs that are commonly used.
The programs we use
We’ll use a few programs throughout this practical. You’ll probably need these for your (future) genetic epidemiology work too (Table 3.1).
Table 3.1: Programs needed for genetic epidemiology.
| Program | Link | Description |
|---|---|---|
| PLINK | https://www.cog-genomics.org/plink2/ | PLINK is a free, open-source genetic analysis tool set, designed to perform a range of basic data parsing and quality control, as well as basic and large-scale analyses in a computationally efficient manner. |
| R | https://cran.r-project.org/ | A program to perform statistical analysis and visualizations. |
| RStudio | https://www.rstudio.com | A user-friendly interface around R for code editing, debugging, analyses, and visualization. |
| Homebrew | https://brew.sh | A great extension for Mac users to install really useful programs that Apple didn't. |
RStudio
RStudio is a very user-friendly interface around R that makes your R-scripting life a lot easier. You should get used to it. RStudio does not include R itself, so both need to be installed; the installation steps further on cover both.
PLINK
Right, onto PLINK.
All genetic analyses can be done in PLINK, even on your laptop, but for large datasets, for example of UK Biobank size, it is better to switch to a high-performance computing (HPC) cluster like the one we have available at the Utrecht Science Park. The original PLINK v1.07 can be found here, but nowadays we use a newer, faster version, PLINK v1.9, which can be found here. It still says ‘PLINK 1.90 beta’ (Figure 3.1), but you can consider this version stable and safe to work with. As you can see, though, some functions are no longer supported.
Alternatives to PLINK
Nowadays, a lot of people also use programs like SNPTEST, BOLT-LMM, GCTA, or regenie as alternatives to execute GWAS. These programs were designed with specific use cases in mind, for instance really large biobank data including hundreds of thousands of individuals, better control for population stratification, the ability to estimate trait heritability or Fst, and so on.
Other programs
Mendelian randomization can be done either with the SMR or GSMR function from GCTA, or with R-packages, like TwoSampleMR.
CoCalc
[ TEXT NEEDS UPDATING]
Now, pay attention. If you came here through the course Genetic Epidemiology, you don’t have to do anything. All the data you need are already downloaded.
However, when you are using this book as a standalone, you’ll need to start by downloading the data you need for this practical to your Desktop.
For the course we set up a CoCalc Server and everything should be fine; we installed everything you need.
Standalone
So, you plan to use this book as ‘Standalone’ on a macOS environment. This means you’ll need to install a few things first.
The data you need
You’ll need to start by downloading the data you need for this practical to your Desktop.
Here’s the link to the data.
Link to Google Drive with data
Make sure you put the data in the ~/Desktop/practical/ folder.
The data are pretty large (approx. 15 GB), so the download can take a while depending on your internet connection. Time to stretch your legs or grab a coffee (data scientists don’t drink tea).
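Before downloading, it can help to make sure the target folder exists and that there is enough free disk space. The lines below are a minimal sketch, assuming the folder location named above; the variable name PRACTICAL_DIR is introduced here purely for illustration.

```shell
# The folder this practical expects (taken from the text above).
PRACTICAL_DIR="$HOME/Desktop/practical"

# Create the folder if it does not exist yet; with -p nothing happens
# when it is already there, and missing parent folders are created too.
mkdir -p "$PRACTICAL_DIR"

# Show how much free space is left on the disk that holds the folder.
df -h "$PRACTICAL_DIR"

# Show how much space the files in the folder currently take up.
du -sh "$PRACTICAL_DIR"
```

Running this again after the download finishes gives a quick check that the files actually ended up in the right place.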
Terminal
For all the programs we use, except RStudio, you will need the Terminal. This comes with every major operating system; on Windows it is called ‘PowerShell’, but let’s not go there. And regardless, you will (have to start to) make your own scripts. The benefit of using scripts is that each step in your workflow is clearly stipulated and annotated, and it allows for greater reproducibility, easier troubleshooting, and scaling up to high-performance computer clusters.
Open the Terminal; it should be on the left in the toolbar as a little black computer-monitor-like icon. Mac users can press command + space and type terminal, after which a Terminal window should open.
From now on we will use little code blocks like the example below to indicate code you should type or copy-paste, followed by hitting enter. If code is followed by a comment, indicated by a #, you don’t need to copy-paste and execute the comment.
CODE BLOCK
CODE BLOCK # some comment here
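To make the point about scripts a bit more concrete, here is a minimal sketch of what a small, annotated shell script could look like. The file name and the PROJECT and START_DIR variables are made up for this illustration; they are not part of the course material.

```shell
#!/bin/bash
# my_first_script.sh: a hypothetical file name, used only for this example.

# Keep settings in variables at the top, so they are easy to find and change.
PROJECT="practical"

# Step 1: announce what is running and when.
echo "Project: $PROJECT"
date

# Step 2: record which directory the script was run from.
START_DIR="$(pwd)"
echo "Started in: $START_DIR"   # a comment can also follow a command
```

If you save these lines in a file called my_first_script.sh, you can run all the steps in one go with `bash my_first_script.sh`, and anyone reading the file can see what each step is for.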
Navigating the Terminal
You can navigate around the computer through the terminal by typing cd <path>; cd stands for “change directory” and <path> means “some_file_directory_you_want_to_go_to”.
This command will bring you to your home directory.
cd ~
This will bring you to the parent directory (up one level).
cd ../
This will bring you to the XXX directory.
cd XXX
Let’s navigate to the folder you just downloaded.
cd ~/Desktop/practical
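If you would like to practise moving around without touching your real files, the sketch below builds a throwaway directory tree and walks through it. The folder names and the SCRATCH and HERE variables are assumptions made for this example only; pwd ("print working directory") is a standard command that shows where you currently are.

```shell
# Make a throwaway directory tree to practise in; the folder names
# are made up and stand in for any real project folder.
SCRATCH="$(mktemp -d "${TMPDIR:-/tmp}/practice.XXXXXX")"
mkdir -p "$SCRATCH/practical/results"

cd "$SCRATCH/practical/results"   # go straight to a nested directory
pwd                               # print the full path of where you are

cd ../                            # up one level, into 'practical'
HERE="$(basename "$PWD")"         # keep only the last part of the path
echo "Now in: $HERE"

cd results                        # a relative path: down into 'results'
cd "$SCRATCH"                     # an absolute path works from anywhere
```

The difference between the last two commands is worth noting: a relative path depends on where you are, while an absolute path always points to the same place.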
Let’s check out what is inside the directory, by listing (ls) its contents.
This command shows files as a list; the -l flag makes it a vertical list and adds more information. You can also leave it out and simply type ls; go on, and try.
ls -l
This command shows files as a list with sizes in a human-readable format.
ls -lh
Adding the flags -lh will get you the contents of a directory in a list (-l) and make the size ‘human-readable’ (-h).
Adding -t shows the files as a list sorted by time edited.
ls -lt
Adding -S shows the files as a list sorted by size.
ls -lS
You can also count the number of files. Just ‘pipe’ the result from ls to the next program, wc (‘word count’), and count the number of lines with -l. In this case -l is a flag used by wc, and it has a different meaning than it does for ls.
ls | wc -l
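Flags can be combined and pipes can be chained. The sketch below tries this out in a throwaway directory with three small files; the file names and the variables SCRATCH, N_ALL, N_TXT, and LARGEST are made up for this example, and head and tr are standard tools that are not otherwise used in this chapter.

```shell
# Build a throwaway directory with three files of different sizes
# (all names here are made up for the example).
SCRATCH="$(mktemp -d "${TMPDIR:-/tmp}/practice.XXXXXX")"
cd "$SCRATCH"
printf 'a\n' > small.txt
printf 'a much longer line of text\n' > large.txt
printf 'some log text\n' > run.log

# Flags can be combined: long list, human-readable sizes, largest first.
ls -lhS

# Pipe the output of ls into wc to count all files;
# tr removes the padding spaces that wc adds on macOS.
N_ALL="$(ls | wc -l | tr -d ' ')"

# The same idea, restricted to files ending in .txt.
N_TXT="$(ls *.txt | wc -l | tr -d ' ')"

# head keeps only the first line, here the largest file.
LARGEST="$(ls -S | head -n 1)"

echo "files: $N_ALL, txt files: $N_TXT, largest: $LARGEST"
```

Each program in the chain does one small job and hands its output to the next, which is the general pattern behind most command-line work in later chapters.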
And if you want to know all the functions of a program, simply type the following.
man ls
This will take you to a manual of the program with an extensive description of each flag (Figure 3.2).
Installing the software
brew
Linux distributions come with a great package manager, something macOS lacks. You can install brew to compensate for this. It adds the ability to install almost any Linux-based program through the Terminal, such as wget, llvm, etc.
Open Terminal and execute the following:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Check if everything is in order.
brew doctor
It shouldn’t report any errors.
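If you are not sure whether brew is already on your system, you can check before (re)installing it. This is a small sketch using the standard shell built-in `command -v`; the variable BREW_STATUS is introduced here only for illustration.

```shell
# 'command -v' succeeds only if the shell can find the program.
if command -v brew > /dev/null 2>&1; then
    BREW_STATUS="installed"
    brew --version            # print which version of Homebrew is present
else
    BREW_STATUS="missing"
    echo "Homebrew was not found; run the install command shown above."
fi

echo "Homebrew status: $BREW_STATUS"
```

The same pattern works for any other command-line program: replace brew with the name of the program you want to check.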
PLINK
First, we’ll get PLINK. Navigate to the PLINK v1.9 website, which can be found here. Download the macOS (64-bit) version under ‘Stable (beta x.x, day month year)’.
Note: until a few years ago Apple produced Intel-based computers, and most programs, packages, libraries and whatnot are designed for those. So, I highly recommend using software built for Intel and activating Rosetta 2 in your Terminal. Don’t know how to do that? Follow these instructions.
Unzip the folder and put plink in the practical folder.
mv -v ~/Downloads/plink_mac_20231211/plink ~/Desktop/practical/plink
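After moving the file it is worth checking that PLINK is where you expect it and that it starts. The sketch below assumes the location used in the mv command above; the variables PLINK and PLINK_STATUS are names chosen for this example.

```shell
# Where the mv command above should have put the program.
PLINK="$HOME/Desktop/practical/plink"

if [ -f "$PLINK" ]; then
    PLINK_STATUS="found"
    chmod +x "$PLINK"         # make sure the file is allowed to run
    "$PLINK" --version        # print the PLINK version and exit
else
    PLINK_STATUS="missing"
    echo "No plink file at $PLINK; check the name of the unzipped folder."
fi

echo "PLINK status: $PLINK_STATUS"
```

If the file is reported missing, the name of the unzipped folder in your Downloads probably differs from the one in the mv command, since it contains the release date of the version you downloaded.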
Installing R and RStudio
Let’s go ahead and use brew to install the R and RStudio software.
In Terminal execute the following and just follow the instructions.
brew install rstudio
brew install --cask r
Now close the terminal window - really make sure that the terminal-program has quit.
Open your fresh installation of RStudio by double-clicking the icon. You should see something like Figure 3.3.
In the top right, you see a little green-white plus-sign, click this and select ‘R Notebook’ (Figure 3.4).
You will create an untitled (Untitled1) R notebook: you can combine text descriptions, like you would in a lab-journal, with code-sections. Read what is in the notebook to get a grasp on that (Figure 3.5).
Right, you should install some packages. To do so, you can remove plot(cars) (or leave it and create a new code block as per the instructions in the notebook), and copy-paste the code below. Make sure to put it in a code block like the one that contains plot(cars).
# Install the CRAN packages first; remotes and devtools are needed for the GitHub installs.
install.packages(c("formatR", "remotes",
"httr", "usethis",
"data.table", "devtools",
"dplyr", "tibble", "tidyverse",
"openxlsx",
"ggplot2",
"ggsci", "ggthemes",
"qqman", "CMplot", "plotly"))
remotes::install_github("rstudio/rmarkdown")
devtools::install_github("kassambara/ggpubr")
devtools::install_github("oliviasabik/RACER")
remotes::install_github("MRCIEU/TwoSampleMR")
devtools::install_github("MRCIEU/MRInstruments")
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("geneplotter")
You should load these packages too.
library(rmarkdown)
library(formatR)
library(openxlsx)
library(data.table)
library(tibble)
library(tidyverse)
library(dplyr)
library(plotly)
library(ggplot2)
library(devtools)
library(ggpubr)
library(ggsci)
library(ggthemes)
library(qqman)
library(CMplot)
library(RACER)
library(remotes)
library(TwoSampleMR)
library(MRInstruments)
library("geneplotter")
All in all this may take some time, so this is a good moment to relax, review your notes, stretch your legs, or grab a coffee.
Are you ready?
Are you ready? Did you bring coffee and a good dose of energy? Let’s start!
Oh, one more thing: you can save your notebook, the one you just created, to keep all the R code you apply in the next chapters and to add descriptions and notes. If you save this notebook you’ll notice that an HTML file is created. This file is a legible, web-browser-friendly version of your work and contains the code and the output (code messages, tables, and figures). And the nice thing is that you can easily share it with others over email.
Ok. ’Nough said, let’s move on to cover some basics in Chapter ??.
Licenses and disclaimers
Copyright
This book and all its material (“content”) is protected by copyright under Dutch Copyright laws and is the property of the author or the party credited as the provider of the content. You may not copy, reproduce, distribute, publish, display, perform, modify, create derivative works, transmit, or in any way exploit any such content, nor may you distribute any part of this content over any network, including a local area network, sell or offer it for sale, or use such content to construct any kind of database. You may not alter or remove any copyright or other notice from copies of the content on this website. Copying or storing any content except as provided above is expressly prohibited without prior written permission of the author or the copyright holder identified in the individual content’s copyright notice. For permission to use this content, please contact the author.
Disclaimer
The content contained herein is provided only for educational and informational purposes or as required by Dutch law. The author attempted to ensure that content is accurate and obtained from reliable sources, but does not represent it to be error-free. The author may add, amend or repeal any text, procedure or regulation, and failure to timely post such changes to this book shall not be construed as a waiver of enforcement. The author does not warrant that any functions on this website or the contents and references herein will be uninterrupted, that defects will be corrected, or that this website or the contents and references will be free from viruses or other harmful components. Any links to third party information on the author’s website are provided as a courtesy and do not constitute an endorsement of those materials or the third party providing them.
Images and data used
I took the utmost care to refer to the original works and data sources where needed. Likewise, all the images and figures are either produced specifically for this book, or taken from Unsplash to brighten up the book. If you feel I made a mistake and your work should be properly referenced, please don’t hesitate to contact me.
These are the images from Unsplash listed here in no particular order.
Colophon
The 2022 and 2024 editions of this book were produced in RStudio with the bookdown package. Below is a listing of the installed programs and libraries, the operating system, and their specific versions.
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.3.3 (2024-02-29)
## os macOS Sonoma 14.5
## system x86_64, darwin20
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2024-04-04
## pandoc 3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## askpass 1.2.0 2023-09-03 [2] CRAN (R 4.3.0)
## bookdown * 0.38.1 2024-03-26 [2] Github (rstudio/bookdown@50a1c1e)
## bslib 0.6.2 2024-03-22 [2] CRAN (R 4.3.2)
## cachem 1.0.8 2023-05-01 [2] CRAN (R 4.3.0)
## chromote 0.2.0 2024-02-12 [1] CRAN (R 4.3.2)
## cli 3.6.2 2023-12-11 [2] CRAN (R 4.3.0)
## colorspace 2.1-0 2023-01-23 [2] CRAN (R 4.3.0)
## crayon 1.5.2 2022-09-29 [2] CRAN (R 4.3.0)
## crul 1.4.0 2023-05-17 [2] CRAN (R 4.3.0)
## curl 5.2.1 2024-03-01 [2] CRAN (R 4.3.2)
## data.table 1.15.4 2024-03-30 [1] CRAN (R 4.3.2)
## digest 0.6.35 2024-03-11 [2] CRAN (R 4.3.2)
## evaluate 0.23 2023-11-01 [2] CRAN (R 4.3.0)
## fastmap 1.1.1 2023-02-24 [2] CRAN (R 4.3.0)
## flextable * 0.9.5 2024-03-06 [1] CRAN (R 4.3.2)
## fontBitstreamVera 0.1.1 2017-02-01 [2] CRAN (R 4.3.0)
## fontLiberation 0.1.0 2016-10-15 [2] CRAN (R 4.3.0)
## fontquiver 0.2.1 2017-02-01 [2] CRAN (R 4.3.0)
## formatR * 1.14 2023-01-17 [2] CRAN (R 4.3.0)
## gdtools 0.3.7 2024-03-05 [2] CRAN (R 4.3.2)
## gfonts 0.2.0 2023-01-08 [2] CRAN (R 4.3.0)
## glue 1.7.0 2024-01-09 [2] CRAN (R 4.3.0)
## htmltools 0.5.8 2024-03-25 [2] CRAN (R 4.3.2)
## httpcode 0.3.0 2020-04-10 [2] CRAN (R 4.3.0)
## httpuv 1.6.15 2024-03-26 [2] CRAN (R 4.3.2)
## jquerylib 0.1.4 2021-04-26 [2] CRAN (R 4.3.0)
## jsonlite 1.8.8 2023-12-04 [2] CRAN (R 4.3.0)
## kableExtra * 1.4.0 2024-01-24 [1] CRAN (R 4.3.2)
## knitr * 1.45 2023-10-30 [1] CRAN (R 4.3.0)
## later 1.3.2 2023-12-06 [2] CRAN (R 4.3.0)
## lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.3.0)
## magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.3.0)
## mime 0.12 2021-09-28 [2] CRAN (R 4.3.0)
## munsell 0.5.1 2024-04-01 [1] CRAN (R 4.3.2)
## officer 0.6.5 2024-02-24 [2] CRAN (R 4.3.2)
## openssl 2.1.1 2023-09-25 [2] CRAN (R 4.3.0)
## processx 3.8.4 2024-03-16 [2] CRAN (R 4.3.2)
## promises 1.2.1 2023-08-10 [2] CRAN (R 4.3.0)
## ps 1.7.6 2024-01-18 [2] CRAN (R 4.3.0)
## R6 2.5.1 2021-08-19 [2] CRAN (R 4.3.0)
## ragg 1.3.0 2024-03-13 [2] CRAN (R 4.3.2)
## Rcpp 1.0.12 2024-01-09 [2] CRAN (R 4.3.0)
## rlang 1.1.3 2024-01-10 [2] CRAN (R 4.3.0)
## rmarkdown * 2.26.1 2024-03-26 [2] Github (rstudio/rmarkdown@ee69d59)
## rstudioapi 0.16.0 2024-03-24 [2] CRAN (R 4.3.2)
## sass 0.4.9 2024-03-15 [2] CRAN (R 4.3.2)
## scales 1.3.0 2023-11-28 [2] CRAN (R 4.3.0)
## sessioninfo 1.2.2 2021-12-06 [2] CRAN (R 4.3.0)
## shiny 1.8.1 2024-03-26 [2] CRAN (R 4.3.2)
## stringi 1.8.3 2023-12-11 [2] CRAN (R 4.3.0)
## stringr 1.5.1 2023-11-14 [2] CRAN (R 4.3.0)
## svglite 2.1.3 2023-12-08 [1] CRAN (R 4.3.0)
## systemfonts 1.0.6 2024-03-07 [2] CRAN (R 4.3.2)
## textshaping 0.3.7 2023-10-09 [2] CRAN (R 4.3.0)
## tinytex * 0.50 2024-03-16 [2] CRAN (R 4.3.2)
## uuid 1.2-0 2024-01-14 [2] CRAN (R 4.3.0)
## viridisLite 0.4.2 2023-05-02 [2] CRAN (R 4.3.0)
## webshot * 0.5.5 2023-06-26 [1] CRAN (R 4.3.0)
## webshot2 * 0.1.1 2023-08-11 [1] CRAN (R 4.3.0)
## websocket 1.4.1 2021-08-18 [1] CRAN (R 4.3.0)
## xfun 0.43 2024-03-25 [2] CRAN (R 4.3.2)
## xml2 1.3.6 2023-12-04 [2] CRAN (R 4.3.0)
## xtable 1.8-4 2019-04-21 [2] CRAN (R 4.3.0)
## yaml 2.3.8 2023-12-11 [2] CRAN (R 4.3.0)
## zip 2.3.1 2024-01-27 [2] CRAN (R 4.3.2)
##
## [1] /Users/slaan3/Library/R/x86_64/4.3/library
## [2] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library
##
## ──────────────────────────────────────────────────────────────────────────────
References
--- 
title: "A Practical Primer in Human Complex Genetics"
subtitle: "with a use-case in cardiovascular disease"
author: "[dr. Sander W. van der Laan](https://vanderlaanand.science) [![](./img/_logo/twitter_circle_blue.png){width=2%}](https://www.twitter.com/swvanderlaan) [![](./img/_logo/email_circle_blue.png){width=2%}](mailto:s.w.vanderlaan@gmail.com)"
date: "Version 2.0.2 (2024-04-04)"
description: "This is a practical primer in human complex genetics with a use-case in cardiovascular disease. The output format for this primer is bookdown::gitbook."
documentclass: book
github-repo: swvanderlaan/A_Practical_Primer_in_Human_Complex_Genetics
link-citations: yes
bibliography:
- bibliography/book.bib
- bibliography/packages.bib
biblio-style: apalike
site: bookdown::bookdown_site
always_allow_html: yes
#cover-image: "images/cover.png"
#apple-touch-icon: "touch-icon.png"
#apple-touch-icon-size: 120
#favicon: "favicon.ico"
---

# About this primer

Ever since the first genome-wide association study (GWAS) on [age-related macular degeneration](https://doi.org/10.1126/science.1109557){target="_blank"}, and the promise of personalized medicine in the wake of the Human Genome Project, large-scale genetic association studies have held significant sway in contemporary health research and have [driven drug-development pipelines](http://dx.doi.org/10.1038/nrd.2017.262){target="_blank"}. Over the past two decades, researchers have used GWAS to uncover genetic variants linked to human traits, such as the color of your eyes, and to rare and common complex diseases. These findings help to unravel the mechanisms underlying diseases and shed light on whether the correlations between risk factors and diseases identified in observational studies are truly causal. 

These studies have ushered in an exciting era in which many researchers thrive on developing new methods and bioinformatic tools to parse the ever-growing datasets collected in large population-based biobanks. However, analyzing these data is challenging, and it can be daunting to see the forest for the trees among the many tools and their various functions. Enter _A Practical Primer in Human Complex Genetics_. This [GitBook](https://cjvanlissa.github.io/gitbook-demo/){target="_blank"} was originally written in 2022 for the **Genetic Epidemiology** course organized by the [Master Epidemiology](https://epidemiology-education.nl){target="_blank"} of Utrecht University. This practical guide will teach you how to design a GWAS, perform quality control (QC), execute the actual analyses, annotate the GWAS results, and perform further downstream post-GWAS analyses. Throughout the book you'll work with 'dummy', that is fake, data, but in the end we will use real-world data from the first release of the [*Wellcome Trust Case-Control Consortium (WTCCC)*](https://www.wtccc.org.uk/ccc1/overview.html){target="_blank"}, focusing on coronary artery disease (CAD). 

A major component of modern-day GWAS is [genetic imputation](https://www.nature.com/articles/nrg2796){target="_blank"}, but for practical reasons it is not part of this book. However, I will provide some pointers on how to go about this with minimal coding or scripting experience. Likewise, the course does not cover meta-analyses of GWAS, but some excellent resources exist to which I will direct you. As this practical primer evolves, these and other topics may find their place in this book. 
I should also point out that the emphasis of this book is on being a _practical primer_. It is intended to provide practical guidance on doing GWAS, and while theory is important, I will not cover it. Again, some very useful and excellent work exists to which I will point you, but I really want you to learn, and understand the theory, by _doing_. 

So, although originally crafted as a companion for the course, this practical guide stands on its own as a comprehensive resource for diving into all facets of doing a GWAS — save for experimental follow-up, of course 😉.

I can imagine this seems overwhelming, but trust me, you'll be okay. Just follow this practical. You'll learn by doing and at the end of the day, you can execute a GWAS independently.

**Ready to start?**

<!-- Your first point of action is to prepare your system for this course in Chapter \@ref(somebackgroundreading). -->

<script>
title=document.getElementById('header');
title.innerHTML = '<img src="img/_headers/banner_man_standing_dna.png" alt="A Practical Primer in Human Complex Genetics">' + title.innerHTML
</script>

<!--chapter:end:index.Rmd-->

# Some background reading  {#somebackgroundreading}
<!-- ![](./img/_headers/papers_on_wall.png){width=100%} -->





Standing on the shoulders of giants, that's what this book and I do. I want to acknowledge some great work that has helped me tremendously and, really, this book wouldn't exist without this awesome work. So, I do want to give you some background reading. Is it a prerequisite? No, not really. For starters, the course covers most of it and you'll learn as you go. And if you didn't come here through the course, you'll be fine just the same. That said, it's always a good idea to get familiar with these works as you move forward on your path towards your first GWAS. In fact, I had these printed out with markings and writings all over them as I executed my first GWAS, and they've been great as a reference many times after. 

Large parts of this work are based on four awesome Nature Protocols from the [Zondervan group](https://www.well.ox.ac.uk/research/research-groups/zondervan-group){target="_blank"} at the Wellcome Centre for Human Genetics.

1. [Zondervan KT _et al._ *Designing candidate gene and genome-wide case-control association studies.* Nat Protoc 2007.](https://www.ncbi.nlm.nih.gov/pubmed/17947991){target="_blank"}
2. [Pettersson FH _et al._ *Marker selection for genetic case-control association studies.* Nat Protoc 2009.](https://www.ncbi.nlm.nih.gov/pubmed/19390530){target="_blank"}
3. [Anderson CA _et al._ *Data QC in genetic case-control association studies.* Nat Protoc 2010.](https://www.ncbi.nlm.nih.gov/pubmed/21085122){target="_blank"}
4. [Clarke GM _et al._ *Basic statistical analysis in genetic case-control studies.* Nat Protoc 2011.](https://www.ncbi.nlm.nih.gov/pubmed/21293453){target="_blank"}

An update on the community standards of QC for GWAS can be found here:

1. [Laurie CC _et al._ *Quality control and quality assurance in genotypic data for genome-wide association studies.* Genet Epidemiol 2010.](https://www.ncbi.nlm.nih.gov/pubmed/20718045){target="_blank"}

With respect to imputation and meta-analyses of GWAS you should also get familiar with the following works:

1. [Marchini J and Howie B. *Genotype imputation for genome-wide association studies.* Nat Rev Genet 2010.](https://doi.org/10.1038/nrg2796){target="_blank"}
2. [de Bakker PIW _et al._ *Practical aspects of imputation-driven meta-analysis of genome-wide association studies.* Hum Mol Genet 2008.](https://www.ncbi.nlm.nih.gov/pubmed/18852200){target="_blank"}
3. [Winkler TW _et al._ *Quality control and conduct of genome-wide association meta-analyses.* Nat Protoc 2014.](https://www.ncbi.nlm.nih.gov/pubmed/24762786){target="_blank"}


**Are you ready?**

Are you ready? Did you bring coffee and a good dose of energy? Let's start! Your first point of action is to prepare your system for this course in Chapter \@ref(getting-started).

<!-- ```{js, echo = FALSE} -->
<!-- title=document.getElementById('header'); -->
<!-- title.innerHTML = '<img src="img/_headers/papers_on_wall.png" alt="Some background reading">' + title.innerHTML -->
<!-- ``` -->

<!--chapter:end:02_1_somebackgroundreading.Rmd-->

# Getting started {#getting-started}
<!-- ![](./img/_headers/women_behind_macbook.png){width=100%} -->






> [THIS CHAPTER NEEDS WORK]
>
> - introduction
> - briefly touch on operating system
> - split CoCalc vs standalone
>   - CoCalc: everything is installed
>   - standalone: 
>     - focus on macOS
>     - what to install
>     - show how to navigate on macOS Terminal
> [Some introductory text]

## Your computer

Before getting started, we need to discuss your computer. Most programs made to execute genetic epidemiology studies are developed for the Unix environment, for example Linux and macOS. So, they may not work as intended in a Windows environment. Windows does allow users to install a Linux subsystem within Windows 10+ and you can find the detailed [guide](https://docs.microsoft.com/en-us/windows/wsl/about){target="_blank"} here.  

However, I highly recommend one of two options. 

- One, install a Linux subsystem on your Windows computer (for example [a virtual machine with Ubuntu could work](https://blog.storagecraft.com/the-dead-simple-guide-to-installing-a-linux-virtual-machine-on-windows/){target="_blank"}). 
- Two, switch to macOS in combination with [homebrew](https://brew.sh){target="_blank"}. This will give you all the flexibility to use Unix-based programs for your genetic epidemiology work and at the same time you'll keep the advantage of a powerful computer with a user-friendly interface.

I chose the latter. 

> For this practical every command is intended for Linux/macOS, in other words Unix-systems.

## CoCalc vs. Standalone

For the purpose of this practical primer there are two routes to get started, and you need to take one of them. If you are following the course, read the section **CoCalc**. If you are using this book as a standalone, check out the instructions in the section **Standalone**; this is probably also the section you want to follow for real-world cases. 

But first, I'll briefly provide some background on the various programs that are commonly used.

## The programs we use

We'll use a few programs throughout this practical. You'll probably need these for your (future) genetic epidemiology work too (Table \@ref(tab:programs)).





Table: (\#tab:programs) Programs needed for genetic epidemiology.

| Program | Link | Description |
|:--------|:-----|:------------|
| PLINK | <https://www.cog-genomics.org/plink2/> | PLINK is a free, open-source genetic analysis tool set, designed to perform a range of basic data parsing and quality control, as well as basic and large-scale analyses in a computationally efficient manner. |
| R | <https://cran.r-project.org/> | A program to perform statistical analysis and visualizations. |
| RStudio | <https://www.rstudio.com> | A user-friendly R-wrap-around for code editing, debugging, analyses, and visualization. |
| Homebrew | <https://brew.sh> | A great extension for Mac-users to install really useful programs that Apple didn't. |

### RStudio
**RStudio** is a very user-friendly interface around `R` that makes your `R`-scripting life a lot easier, so it is worth getting used to it. Note that **RStudio** does not include `R` itself; you need to install both, as described in the **Standalone** section.

### PLINK
Right, onto `PLINK`. 

All genetic analyses can be done in PLINK, even on your laptop. With large datasets, for example of [UK Biobank](https://www.ukbiobank.ac.uk){target="_blank"} size, it is better to switch to a [high-performance computing cluster (HPC)](https://en.wikipedia.org/wiki/High-performance_computing){target="_blank"} like the one we have available at the [Utrecht Science Park](https://wiki.bioinformatics.umcutrecht.nl/bin/view/HPC/WebHome){target="_blank"}. The original PLINK v1.07 can be found [here](https://zzz.bwh.harvard.edu/plink/index.shtml){target="_blank"}, but nowadays we use a newer, faster version: **PLINK v1.9**, which can be found [here](https://www.cog-genomics.org/plink2){target="_blank"}. It still says 'PLINK 1.90 beta' (Figure \@ref(fig:plinkprogram)), but you can consider this version stable and safe to work with. As you can see on the website, some functions are not supported anymore.

<div class="figure" style="text-align: center">
<img src="img/plink.png" alt="The PLINK v1.9 website." width="85%" />
<p class="caption">(\#fig:plinkprogram)The PLINK v1.9 website.</p>
</div>


### Alternatives to `PLINK`
Nowadays, a lot of people also use programs like [SNPTEST](snptest){target="_blank"}, [BOLT-LMM](https://data.broadinstitute.org/alkesgroup/BOLT-LMM/){target="_blank"}, [GCTA](http://cnsgenomics.com/software/gcta/#Overview){target="_blank"}, or [regenie](https://rgcgithub.github.io/regenie/){target="_blank"} as alternatives to execute GWAS. These programs were designed with specific use-cases in mind, for instance really large biobank data including hundreds of thousands of individuals, better control for population stratification, the ability to estimate trait heritability or Fst, and so on.

### Other programs
Mendelian randomization can be done either with the [SMR](http://cnsgenomics.com/software/smr/#Overview){target="_blank"} or [GSMR](http://cnsgenomics.com/software/gsmr/){target="_blank"} function from GCTA, or with R-packages, like [`TwoSampleMR`](https://mrcieu.github.io/TwoSampleMR/){target="_blank"}.


## CoCalc

Now, pay attention. If you came here through the course **Genetic Epidemiology**, you don't have to do anything. All the data you need are already downloaded. 

However, when you are using this book as a standalone, you'll need to start by downloading the data you need for this practical to your Desktop. 

For the course we set up a CoCalc Server and everything should be fine; we installed everything you need. 

## Standalone

So, you plan to use this book as 'Standalone' on a macOS environment. This means you'll need to install a few things first.

### The data you need

You'll need to start by downloading the data you need for this practical to your Desktop. 

Here's the link to the data. 

[Link to Google Drive with data](https://drive.google.com/drive/folders/1iDLB1y534DfgEZNPCYBrIj5X7g_XlBba?usp=share_link){target="_blank"}

Make sure you put the data in the `~/Desktop/practical/` folder.

The data are pretty large (approx. 15 GB), so the download may take a while depending on your internet connection. Time to stretch your legs or grab a coffee (data scientists don't drink tea). 
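
If the `~/Desktop/practical/` folder does not exist yet, you can create it from the **Terminal** and, once the download has finished, check how much space the data take up. This is a minimal sketch; `PRACTICAL_DIR` is just a variable name chosen for this example.

```shell
# Folder used throughout this practical; adjust if you store the data elsewhere.
PRACTICAL_DIR="$HOME/Desktop/practical"

# Create the folder if it does not exist yet (-p: no error if it already exists).
mkdir -p "$PRACTICAL_DIR"

# Report the total size of the folder in a human-readable format.
du -sh "$PRACTICAL_DIR"
```

After a complete download the reported size should be in the order of the 15 GB mentioned above.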

### Terminal 

For all the programs we use, except **RStudio**, you will need the **Terminal**. This comes with every major operating system; on Windows it is called 'PowerShell', but let's not go there. And regardless, you will (have to start to) make your own scripts. The benefit of using scripts is that each step in your workflow is clearly stipulated and annotated, and it allows for greater reproducibility, easier troubleshooting, and scaling up to high-performance computer clusters.

Open the **Terminal**; it should be on the left in the toolbar as a little black computer-monitor-like icon. Mac users can press `command + space`, type `terminal`, and hit enter; a **Terminal** window should open.

> From now on we will use little code blocks like the example below to indicate code you should type or copy-paste and then hit enter. If code is followed by a comment, indicated by a `#`, you don't need to copy-paste and execute the comment.

```
CODE BLOCK

CODE BLOCK # some comment here
```

### Navigating the Terminal

You can navigate around the computer through the terminal by typing `cd <path>`; `cd` stands for "change directory" and `<path>` means "some_file_directory_you_want_to_go_to".


This command will bring you to your home directory.

```
cd ~ 
```

This will bring you to the parent directory (up one level).

```
cd ../ 
```

This will bring you to the XXX directory.

```
cd XXX 
```


Let's navigate to the folder you just downloaded.

```
cd ~/Desktop/practical
```
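
To get a feel for these commands without touching your own files, you could practice in a throw-away directory. The sketch below assumes nothing beyond standard shell commands; `scratch` is a hypothetical variable name, and `pwd` ("print working directory") shows where you are after each step.

```shell
# Create a scratch directory with a sub-directory called XXX to practice in.
scratch="$(mktemp -d)"
mkdir "$scratch/XXX"

cd "$scratch"   # go to the scratch directory
cd XXX          # go into the XXX directory
pwd             # print the working directory: you are now in XXX
cd ../          # go up one level, back to the scratch directory
pwd
cd -            # jump back to the previous directory (XXX again)
```

Running `pwd` after any `cd` is a cheap way to confirm that you ended up where you intended.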


Let's check out what is inside the directory, by listing (`ls`) its contents.

This command shows the files as a list; the `-l` flag makes it a vertical list and adds more information. You can also remove it and simply type `ls`. Go on, try it.

```
ls -l 
```


This command shows the files as a list with sizes in a human-readable format.

```
ls -lh 
```
Adding the flags `-lh` will get you the contents of a directory in a list (`-l`) and make the size 'human-readable' (`-h`).

Adding `-t` shows the files as a list sorted by modification time.

```
ls -lt 
```

Adding `-S` shows the files as a list sorted by size.

```
ls -lS 
```

You can also count the number of files. Just 'pipe' the result from `ls` to the next program `wc` ('wordcount') and list the number of lines, `-l`. In this case `-l` is a flag used by `wc` and it has a different meaning than it does for `ls`. 

```
ls | wc -l
```
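
The same pipe also works on a selection of files. As a sketch, the example below creates three empty files in a throw-away directory (the file names are made up for illustration) and counts all of them as well as only those ending in `.txt`.

```shell
# Create a scratch directory with three empty example files.
scratch="$(mktemp -d)"
touch "$scratch/a.txt" "$scratch/b.txt" "$scratch/c.log"

# Count everything in the directory; tr removes the padding spaces
# that wc adds on some systems.
n_all="$(ls "$scratch" | wc -l | tr -d ' ')"

# Count only the files ending in .txt.
n_txt="$(ls "$scratch"/*.txt | wc -l | tr -d ' ')"

echo "all files: $n_all, .txt files: $n_txt"
```

The `*` is a wildcard that matches any file name, so `*.txt` selects every file with that extension.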

And if you want to know all the functions of a program, simply type the following.

```
man ls
```

This will take you to the manual of the program with an extensive description of each flag (Figure \@ref(fig:lsmanual)). Press `q` to leave the manual again.

<div class="figure" style="text-align: center">
<img src="img/ls_manual.png" alt="Partial output from the ls manual." width="85%" />
<p class="caption">(\#fig:lsmanual)Partial output from the ls manual.</p>
</div>
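
The navigation and listing commands above can also be collected in a script, which is how you would document a real workflow. Below is a minimal sketch of such a script; the file name `list_practical.sh` and the variable names are hypothetical, and the default folder is the practical folder used in this book.

```shell
#!/usr/bin/env bash
# list_practical.sh - hypothetical example script: list a folder and count its files.

# Stop on errors, on unset variables, and on failures inside a pipe.
set -euo pipefail

# Folder to inspect: the first argument, or the practical folder by default.
target_dir="${1:-$HOME/Desktop/practical}"

if [ -d "$target_dir" ]; then
  ls -lh "$target_dir"
  n_files="$(ls "$target_dir" | wc -l | tr -d ' ')"
  echo "Number of files in $target_dir: $n_files"
else
  echo "Folder not found: $target_dir"
fi
```

You could save this as `list_practical.sh` and run it with `bash list_practical.sh ~/Desktop/practical`. Because every step is written down and commented, you can rerun or share exactly the same workflow later.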

### Installing the software

#### brew

Linux comes with a great package manager, something that is lacking on macOS. You can install [`brew`](https://brew.sh){target="_blank"} to compensate for this. It adds the ability to install almost any Unix-based program, such as `wget` or `llvm`, through the **Terminal**. 

Open **Terminal** and execute the following:

```
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

Check if everything is in order.

```
brew doctor
```

It shouldn't report any errors.
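
If `brew doctor` gives a 'command not found' message instead, the **Terminal** may not know where `brew` lives yet. A quick check, sketched here with the illustrative variable `brew_status`, is to ask the shell whether it can find the command at all.

```shell
# Check whether the brew command is available on your PATH.
if command -v brew >/dev/null 2>&1; then
  brew_status="installed"
  brew --version   # print the installed Homebrew version
else
  brew_status="missing"
  echo "brew not found; follow the steps printed by the installer or open a new Terminal window."
fi
```

Opening a new **Terminal** window after the installation is often enough, because the shell only reads its settings at start-up.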

#### PLINK

First, we'll get `PLINK`. Navigate to the **PLINK v1.9** website, which can be found [here](https://www.cog-genomics.org/plink2){target="_blank"}. Download the macOS (64-bit) version under 'Stable (beta x.x, day month year)'. 

> Note: Until a few years ago Apple produced Intel-based computers, and most programs, packages, libraries, and whatnot are designed for that architecture. So, on newer Macs with Apple silicon, I highly recommend using software built for Intel and activating Rosetta 2 in your Terminal. Don't know how to do that? Follow [these instructions](https://support.apple.com/en-us/102527){target="_blank"}.

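Unzip the folder and move `plink` to the practical folder. The folder name in the command below contains the release date of the version you downloaded, so adjust it if yours differs.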

```
mv -v ~/Downloads/plink_mac_20231211/plink ~/Desktop/practical/plink 
```
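
To confirm that the program ended up in the right place and can be started, you could run a check along these lines. It is a sketch that assumes `plink` was moved to `~/Desktop/practical/` as above; `PLINK` and `plink_status` are illustrative variable names.

```shell
# Expected location of the PLINK program after the mv command above.
PLINK="$HOME/Desktop/practical/plink"

if [ -f "$PLINK" ]; then
  chmod +x "$PLINK"      # make sure the file is executable
  "$PLINK" --version     # print the PLINK version
  plink_status="found"
else
  plink_status="missing"
  echo "No plink program at $PLINK; check the folder name in the mv command."
fi
```

If a version line is printed, PLINK is ready to use from the practical folder.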

#### Installing R and RStudio

Let's go ahead and use `brew` to install the `R` and **RStudio** software.

In **Terminal** execute the following and just follow the instructions.

```
brew install --cask r        # R itself; install this first
brew install --cask rstudio  # the RStudio interface around R
```
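
Before moving on, you may want to verify that `R` can be started from the **Terminal**. The sketch below uses `Rscript`, the command-line front end that ships with `R`; `r_status` is just an illustrative variable name.

```shell
# Check whether R is available from the command line.
if command -v Rscript >/dev/null 2>&1; then
  r_status="installed"
  Rscript -e 'cat(R.version.string, "\n")'   # print the R version
else
  r_status="missing"
  echo "Rscript not found; open a new Terminal window and try again."
fi
```

The printed version string tells you which release of `R` the packages will be installed for.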

Now close the terminal window - really make sure that the terminal-program has quit.

Open your fresh installation of **RStudio** by double-clicking the icon. You should see something like Figure \@ref(fig:rstudioscreenshot).

<div class="figure" style="text-align: center">
<img src="img/rstudio-screenshot.png" alt="RStudio screenshot." width="85%" />
<p class="caption">(\#fig:rstudioscreenshot)RStudio screenshot.</p>
</div>


In the top left, you see a little green-white plus sign; click it and select 'R Notebook' (Figure \@ref(fig:rstudioscreenshotcreatenotebook)). 

<div class="figure" style="text-align: center">
<img src="img/rstudio-screenshot-create-notebook.png" alt="RStudio screenshot." width="85%" />
<p class="caption">(\#fig:rstudioscreenshotcreatenotebook)RStudio screenshot.</p>
</div>

This creates an untitled (`Untitled1`) `R` notebook, in which you can combine text descriptions, like you would in a lab journal, with code sections. Read what is in the notebook to get a grasp on that (Figure \@ref(fig:rstudioscreenshotnotebook)). 

<div class="figure" style="text-align: center">
<img src="img/rstudio-screenshot-notebook.png" alt="RStudio screenshot." width="85%" />
<p class="caption">(\#fig:rstudioscreenshotnotebook)RStudio screenshot.</p>
</div>

Right, now you need to install some packages. To do so, you can remove `plot(cars)` (or leave it and create a new code block as per the instructions in the notebook), and copy-paste the code below. Make sure to put it in a code block like the one that contains `plot(cars)`.

```
# Install the CRAN packages first; `remotes` and `devtools` are needed
# for the GitHub installations further down.
install.packages(c("formatR", "remotes", 
                   "httr", "usethis", 
                   "data.table", "devtools", 
                   "dplyr", "tibble", "tidyverse", 
                   "openxlsx",
                   "ggplot2",
                   "ggsci", "ggthemes",
                   "qqman", "CMplot", "plotly"))

# Development versions from GitHub.
remotes::install_github("rstudio/rmarkdown")
devtools::install_github("kassambara/ggpubr")

devtools::install_github("oliviasabik/RACER")

remotes::install_github("MRCIEU/TwoSampleMR")
devtools::install_github("MRCIEU/MRInstruments")

# Bioconductor packages are installed through BiocManager.
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("geneplotter")
```

You should load these packages too. 

```
library(rmarkdown)
library(formatR)

library(openxlsx)

library(data.table)

library(tibble)
library(tidyverse)
library(dplyr)
library(plotly)

library(ggplot2)
library(devtools)
library(ggpubr)
library(ggsci)
library(ggthemes)

library(qqman)
library(CMplot)
library(RACER)

library(remotes)
library(TwoSampleMR)
library(MRInstruments)

library("geneplotter")
```

All in all this may take some time, so it is a good moment to relax, review your notes, stretch your legs, or grab a coffee.


## Are you ready?

Are you ready? Did you bring coffee and a good dose of energy? Let's start! 

Oh, one more thing: you can save your notebook, the one you just created, to keep all the `R` code you are applying in the next chapters and add descriptions and notes. If you save this notebook you'll notice that an `html` file is created. This file is a legible, web-browser-friendly version of your work and contains the code and the output (code messages, tables, and figures). And the nice thing is that you can easily share it with others over email. 

Ok. 'Nough said, let's move on to cover some basics in Chapter \@ref(gwas-basics).

<!-- ```{js, echo = FALSE} -->
<!-- title=document.getElementById('header'); -->
<!-- title.innerHTML = '<img src="img/_headers/women_behind_macbook.png" alt="Getting started">' + title.innerHTML -->
<!-- ``` -->

<!--chapter:end:02_2_gettingstarted.Rmd-->

# Licenses and disclaimers {#license}
<!-- ![](img/_headers/licenses.png){width=100%} -->





## Copyright

This book and all its material ("content") is protected by copyright under Dutch Copyright laws and is the property of the author or the party credited as the provider of the content. You may not copy, reproduce, distribute, publish, display, perform, modify, create derivative works, transmit, or in any way exploit any such content, nor may you distribute any part of this content over any network, including a local area network, sell or offer it for sale, or use such content to construct any kind of database. You may not alter or remove any copyright or other notice from copies of the content on this website. Copying or storing any content except as provided above is expressly prohibited without prior written permission of the author or the copyright holder identified in the individual content's copyright notice. For permission to use this content, please contact the author.

## Disclaimer

The content contained herein is provided only for educational and informational purposes or as required by Dutch law. The author attempted to ensure that content is accurate and obtained from reliable sources, but does not represent it to be error-free. The author may add, amend or repeal any text, procedure or regulation, and failure to timely post such changes to this book shall not be construed as a waiver of enforcement. The author does not warrant that any functions on this website or the contents and references herein will be uninterrupted, that defects will be corrected, or that this website or the contents and references will be free from viruses or other harmful components. Any links to third party information on the author’s website are provided as a courtesy and do not constitute an endorsement of those materials or the third party providing them.

## Images and data used

I took the utmost care to refer to the original works and data sources where needed. Likewise, all the images and figures are either produced specifically for this book, or taken from [**Unsplash**](https://unsplash.com/s/photos/legal) to brighten up the book. If you feel I made a mistake and your work should be properly referenced, please don't hesitate to contact me. 

These are the images from **Unsplash** listed here in no particular order.

- papers_on_wall - https://unsplash.com/photos/open-book-lot-Oaqk7qqNh_c
- women_behind_macbook - https://unsplash.com/photos/woman-using-macbook-pro-with-person-in-white-top-bPVM4nOy0Rg
- woman_working_on_code - https://unsplash.com/photos/woman-wearing-black-t-shirt-holding-white-computer-keyboard-YK0HPwWDJ1I
- licenses - https://unsplash.com/photos/book-lot-on-black-wooden-shelf-zeH-ljawHtg

## Copyright

Copyright 1979-2024. All rights reserved. Sander W. van der Laan | [s.w.vanderlaan [at] gmail.com](mailto:s.w.vanderlaan@gmail.com) | [https://vanderlaanand.science](https://vanderlaanand.science){target="_blank"}. Published with [`bookdown`](https://bookdown.org/yihui/bookdown/){target="_blank"}.

<!-- ```{js, echo = FALSE} -->
<!-- title=document.getElementById('header'); -->
<!-- title.innerHTML = '<img src="img/_headers/banner_man_standing_dna.png" alt="Licenses">' + title.innerHTML -->
<!-- ``` -->

<!--chapter:end:11_licenses.Rmd-->

# Colophon





The 2022 and 2024 editions of this book were produced in RStudio with the `bookdown` package. Below is a listing of the installed programs and libraries, the operating system, and their specific versions.


```
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.3 (2024-02-29)
##  os       macOS Sonoma 14.5
##  system   x86_64, darwin20
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       America/New_York
##  date     2024-04-04
##  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package           * version date (UTC) lib source
##  askpass             1.2.0   2023-09-03 [2] CRAN (R 4.3.0)
##  bookdown          * 0.38.1  2024-03-26 [2] Github (rstudio/bookdown@50a1c1e)
##  bslib               0.6.2   2024-03-22 [2] CRAN (R 4.3.2)
##  cachem              1.0.8   2023-05-01 [2] CRAN (R 4.3.0)
##  chromote            0.2.0   2024-02-12 [1] CRAN (R 4.3.2)
##  cli                 3.6.2   2023-12-11 [2] CRAN (R 4.3.0)
##  colorspace          2.1-0   2023-01-23 [2] CRAN (R 4.3.0)
##  crayon              1.5.2   2022-09-29 [2] CRAN (R 4.3.0)
##  crul                1.4.0   2023-05-17 [2] CRAN (R 4.3.0)
##  curl                5.2.1   2024-03-01 [2] CRAN (R 4.3.2)
##  data.table          1.15.4  2024-03-30 [1] CRAN (R 4.3.2)
##  digest              0.6.35  2024-03-11 [2] CRAN (R 4.3.2)
##  evaluate            0.23    2023-11-01 [2] CRAN (R 4.3.0)
##  fastmap             1.1.1   2023-02-24 [2] CRAN (R 4.3.0)
##  flextable         * 0.9.5   2024-03-06 [1] CRAN (R 4.3.2)
##  fontBitstreamVera   0.1.1   2017-02-01 [2] CRAN (R 4.3.0)
##  fontLiberation      0.1.0   2016-10-15 [2] CRAN (R 4.3.0)
##  fontquiver          0.2.1   2017-02-01 [2] CRAN (R 4.3.0)
##  formatR           * 1.14    2023-01-17 [2] CRAN (R 4.3.0)
##  gdtools             0.3.7   2024-03-05 [2] CRAN (R 4.3.2)
##  gfonts              0.2.0   2023-01-08 [2] CRAN (R 4.3.0)
##  glue                1.7.0   2024-01-09 [2] CRAN (R 4.3.0)
##  htmltools           0.5.8   2024-03-25 [2] CRAN (R 4.3.2)
##  httpcode            0.3.0   2020-04-10 [2] CRAN (R 4.3.0)
##  httpuv              1.6.15  2024-03-26 [2] CRAN (R 4.3.2)
##  jquerylib           0.1.4   2021-04-26 [2] CRAN (R 4.3.0)
##  jsonlite            1.8.8   2023-12-04 [2] CRAN (R 4.3.0)
##  kableExtra        * 1.4.0   2024-01-24 [1] CRAN (R 4.3.2)
##  knitr             * 1.45    2023-10-30 [1] CRAN (R 4.3.0)
##  later               1.3.2   2023-12-06 [2] CRAN (R 4.3.0)
##  lifecycle           1.0.4   2023-11-07 [2] CRAN (R 4.3.0)
##  magrittr            2.0.3   2022-03-30 [2] CRAN (R 4.3.0)
##  mime                0.12    2021-09-28 [2] CRAN (R 4.3.0)
##  munsell             0.5.1   2024-04-01 [1] CRAN (R 4.3.2)
##  officer             0.6.5   2024-02-24 [2] CRAN (R 4.3.2)
##  openssl             2.1.1   2023-09-25 [2] CRAN (R 4.3.0)
##  processx            3.8.4   2024-03-16 [2] CRAN (R 4.3.2)
##  promises            1.2.1   2023-08-10 [2] CRAN (R 4.3.0)
##  ps                  1.7.6   2024-01-18 [2] CRAN (R 4.3.0)
##  R6                  2.5.1   2021-08-19 [2] CRAN (R 4.3.0)
##  ragg                1.3.0   2024-03-13 [2] CRAN (R 4.3.2)
##  Rcpp                1.0.12  2024-01-09 [2] CRAN (R 4.3.0)
##  rlang               1.1.3   2024-01-10 [2] CRAN (R 4.3.0)
##  rmarkdown         * 2.26.1  2024-03-26 [2] Github (rstudio/rmarkdown@ee69d59)
##  rstudioapi          0.16.0  2024-03-24 [2] CRAN (R 4.3.2)
##  sass                0.4.9   2024-03-15 [2] CRAN (R 4.3.2)
##  scales              1.3.0   2023-11-28 [2] CRAN (R 4.3.0)
##  sessioninfo         1.2.2   2021-12-06 [2] CRAN (R 4.3.0)
##  shiny               1.8.1   2024-03-26 [2] CRAN (R 4.3.2)
##  stringi             1.8.3   2023-12-11 [2] CRAN (R 4.3.0)
##  stringr             1.5.1   2023-11-14 [2] CRAN (R 4.3.0)
##  svglite             2.1.3   2023-12-08 [1] CRAN (R 4.3.0)
##  systemfonts         1.0.6   2024-03-07 [2] CRAN (R 4.3.2)
##  textshaping         0.3.7   2023-10-09 [2] CRAN (R 4.3.0)
##  tinytex           * 0.50    2024-03-16 [2] CRAN (R 4.3.2)
##  uuid                1.2-0   2024-01-14 [2] CRAN (R 4.3.0)
##  viridisLite         0.4.2   2023-05-02 [2] CRAN (R 4.3.0)
##  webshot           * 0.5.5   2023-06-26 [1] CRAN (R 4.3.0)
##  webshot2          * 0.1.1   2023-08-11 [1] CRAN (R 4.3.0)
##  websocket           1.4.1   2021-08-18 [1] CRAN (R 4.3.0)
##  xfun                0.43    2024-03-25 [2] CRAN (R 4.3.2)
##  xml2                1.3.6   2023-12-04 [2] CRAN (R 4.3.0)
##  xtable              1.8-4   2019-04-21 [2] CRAN (R 4.3.0)
##  yaml                2.3.8   2023-12-11 [2] CRAN (R 4.3.0)
##  zip                 2.3.1   2024-01-27 [2] CRAN (R 4.3.2)
## 
##  [1] /Users/slaan3/Library/R/x86_64/4.3/library
##  [2] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library
## 
## ──────────────────────────────────────────────────────────────────────────────
```

<!-- ```{js, echo = FALSE} -->
<!-- title=document.getElementById('header'); -->
<!-- title.innerHTML = '<img src="img/_headers/banner_man_standing_dna.png" alt="Colofon">' + title.innerHTML -->
<!-- ``` -->


<!--chapter:end:12_colophon.Rmd-->


# References {-}


<!--chapter:end:13_references.Rmd-->

